Overcoming Exploration in Reinforcement Learning with Demonstrations
Exploration in environments with sparse rewards has been a persistent problem
in reinforcement learning (RL). Many tasks are natural to specify with a sparse
reward, and manually shaping a reward function can result in suboptimal
performance. However, finding a non-zero reward is exponentially more difficult
with increasing task horizon or action dimensionality. This puts many
real-world tasks out of practical reach of RL methods. In this work, we use
demonstrations to overcome the exploration problem and successfully learn to
perform long-horizon, multi-step robotics tasks with continuous control such as
stacking blocks with a robot arm. Our method, which builds on top of Deep
Deterministic Policy Gradients and Hindsight Experience Replay, provides an
order of magnitude of speedup over RL on simulated robotics tasks. It is simple
to implement and makes only the additional assumption that we can collect a
small set of demonstrations. Furthermore, our method is able to solve tasks not
solvable by either RL or behavior cloning alone, and often ends up
outperforming the demonstrator policy. Comment: 8 pages, ICRA 201
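The claim that a non-zero reward becomes exponentially harder to find as the horizon grows can be illustrated with a toy calculation. This is a minimal sketch under an assumption that is not from the paper: exactly one of `num_actions` discrete actions is correct at each of `horizon` steps, and the sparse reward arrives only if every step is correct. The function names are hypothetical.

```python
def random_success_probability(num_actions: int, horizon: int) -> float:
    """Chance that uniform random exploration picks the single correct
    action at every one of `horizon` steps (toy assumption, see above)."""
    return (1.0 / num_actions) ** horizon


def expected_episodes_to_first_reward(num_actions: int, horizon: int) -> float:
    """Mean number of independent episodes before the first rewarded one
    (mean of a geometric distribution with the success probability above)."""
    return 1.0 / random_success_probability(num_actions, horizon)


if __name__ == "__main__":
    # Each additional step multiplies the expected search effort by num_actions.
    for horizon in (1, 5, 10, 20):
        print(horizon, expected_episodes_to_first_reward(4, horizon))
```

Under this toy model a demonstration helps because it supplies at least one rewarded trajectory directly, instead of waiting for random exploration to produce one.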
Domain Randomization and Generative Models for Robotic Grasping
Deep learning-based robotic grasping has made significant progress thanks to
algorithmic improvements and increased data availability. However,
state-of-the-art models are often trained on as few as hundreds or thousands of
unique object instances, and as a result generalization can be a challenge.
In this work, we explore a novel data generation pipeline for training a deep
neural network to perform grasp planning that applies the idea of domain
randomization to object synthesis. We generate millions of unique, unrealistic
procedurally generated objects, and train a deep neural network to perform
grasp planning on these objects.
Since the distribution of successful grasps for a given object can be highly
multimodal, we propose an autoregressive grasp planning model that maps sensor
inputs of a scene to a probability distribution over possible grasps. This
model allows us to sample grasps efficiently at test time (or avoid sampling
entirely).
We evaluate our model architecture and data generation pipeline in simulation
and the real world. We find we can achieve a 90% success rate on previously
unseen realistic objects at test time in simulation despite having only been
trained on random objects. We also demonstrate an 80% success rate on
real-world grasp attempts despite having only been trained on random simulated
objects. Comment: 8 pages, 11 figures. Submitted to 2018 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS 2018)
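The autoregressive idea in the abstract above, factoring a multimodal grasp distribution into one conditional per grasp dimension, can be sketched as follows. This is an illustrative sketch, not the authors' model: the discretization into bins, the `Conditional` signature, and `toy_conditional` are all assumptions, and greedy decoding is only one possible way to read "avoid sampling entirely".

```python
import random
from typing import Callable, List, Sequence

# Hypothetical interface: given the bins already chosen for earlier grasp
# dimensions, return a probability vector over bins of the next dimension.
Conditional = Callable[[Sequence[int]], List[float]]


def sample_grasp(conditional: Conditional, num_dims: int,
                 rng: random.Random) -> List[int]:
    """Draw one grasp by sampling each dimension given the previous ones."""
    grasp: List[int] = []
    for _ in range(num_dims):
        probs = conditional(grasp)
        grasp.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return grasp


def greedy_grasp(conditional: Conditional, num_dims: int) -> List[int]:
    """Deterministic alternative: take the most likely bin at each step."""
    grasp: List[int] = []
    for _ in range(num_dims):
        probs = conditional(grasp)
        grasp.append(max(range(len(probs)), key=probs.__getitem__))
    return grasp


def toy_conditional(prefix: Sequence[int]) -> List[float]:
    """Stand-in for a learned network: two modes (bin 0 or bin 2) in the
    first dimension; later dimensions mostly follow the first choice."""
    if not prefix:
        return [0.5, 0.0, 0.5]
    probs = [0.05, 0.05, 0.05]
    probs[prefix[0]] = 0.9
    return probs


if __name__ == "__main__":
    print(sample_grasp(toy_conditional, 3, random.Random(0)))
    print(greedy_grasp(toy_conditional, 3))
```

Because each dimension is conditioned on the earlier ones, the sampler can commit to one mode instead of averaging between modes, which is the motivation the abstract gives for a distribution-valued output.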
The Identification of Individuals with Disabilities in National Databases: Creating a Failure to Communicate
The purpose of this study was to analyze similarities and differences in how students with disabilities are identified in national databases. National data collection programs in the U.S. Departments of Education, Commerce, Labor, Justice, and Health and Human Services, as well as databases from the National Science Foundation, the American Council of Education, and the College Board, were examined. Nineteen national data collection programs were selected as being potentially useful in the extraction of policy-relevant information on the educational status and performance of students with disabilities. Among these 19 programs there was significant variability in the disability categories used. These programs were targeted for two reasons: (a) their potential usefulness in providing indicators of domains in key models of educational outcomes for children and youth with disabilities, and (b) their prominence in current efforts to monitor progress toward the attainment of national education goals. Discussed are issues related to improving disability identification in large-scale data collection programs and the effects of these issues on reporting policy-relevant information.
The Identification of People With Disabilities in National Databases: A Failure to Communicate (NCEO Synthesis Report)
A report summarizing findings for policymakers, researchers, and educators that focuses on assessment, accommodations, and accountability in relation to K-12 students with disabilities. The Center is supported through a cooperative agreement with the U.S. Department of Education, Office of Special Education Programs (1990-1995: H159C00004; 1995-2000: H159C50004). Opinions or points of view expressed within this document do not necessarily represent those of the U.S. Department of Education or Offices within it.